• https://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array

  • Note: If your RAID array is degraded, /proc/mdstat will show something like this (a failed member is marked (F) and its slot appears as _ instead of U):

cat /proc/mdstat 

Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md0 : active raid1 sdb1[1] sda1[0](F)
      487104 blocks super 1.2 [2/1] [_U]
md1 : active raid1 sdb2[1] sda2[0](F)
      976142144 blocks super 1.2 [2/1] [_U]
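
To spot a degraded array without reading the output by eye, the status string can be checked for an underscore. This is a minimal sketch, not part of the original procedure: `degraded_arrays` is a hypothetical helper name, and it assumes arrays are named mdN and that the status string (e.g. [_U]) is the last field on the line after the array name.

```shell
# List md arrays whose member-status string (e.g. [_U]) has a missing slot.
# Reads mdstat-formatted text on stdin so it can be tried on saved output.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # remember the current array name
        name != "" && /\[[U_]+\]/ {        # the line carrying [UU], [_U], ...
            if ($NF ~ /_/) print name      # an underscore means a missing member
            name = ""
        }
    '
}

# Usage: degraded_arrays < /proc/mdstat
```

With the output shown above, this would print md0 and md1, one per line.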

== Get information about the hard drive(s) ==

sudo hdparm -I /dev/sda

/dev/sda:
 HDIO_DRIVE_CMD(identify) failed: Input/output error

The Input/output error confirms that /dev/sda is the failed hard drive. A healthy drive will show output similar to:

sudo hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media
    Model Number:       WDC WD10EZEX-00WN4A0                    
    Serial Number:      WD-WCC6Y2ZHZJAH
    Firmware Revision:  01.01A01
    Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
    Used: unknown (minor revision code 0x001f) 
    Supported: 10 9 8 7 6 5 
    Likely used: 10
Configuration:
    Logical     max current
    cylinders   16383   0
    heads       16  0
    sectors/track   63  0
    --
    LBA    user addressable sectors:  268435455
    LBA48  user addressable sectors: 1953525168
    Logical  Sector size:                   512 bytes
    Physical Sector size:                  4096 bytes
    device size with M = 1024*1024:      953869 MBytes
    device size with M = 1000*1000:     1000204 MBytes (1000 GB)
    cache/buffer size  = unknown
    Form Factor: 3.5 inch
    Nominal Media Rotation Rate: 7200
Capabilities:
    LBA, IORDY(can be disabled)
    Queue depth: 32
    Standby timer values: spec'd by Standard, with device specific minimum
    R/W multiple sector transfer: Max = 16  Current = 16
    DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
         Cycle time: min=120ns recommended=120ns
    PIO: pio0 pio1 pio2 pio3 pio4 
         Cycle time: no flow control=120ns  IORDY flow control=120ns
Commands/features:
    Enabled Supported:
       *    SMART feature set
            Security Mode feature set
       *    Power Management feature set
       *    Write cache
       *    Look-ahead
       *    Host Protected Area feature set
       *    WRITE_BUFFER command
       *    READ_BUFFER command
       *    NOP cmd
       *    DOWNLOAD_MICROCODE
            Power-Up In Standby feature set
       *    SET_FEATURES required to spinup after power up
            SET_MAX security extension
       *    48-bit Address feature set
       *    Device Configuration Overlay feature set
       *    Mandatory FLUSH_CACHE
       *    FLUSH_CACHE_EXT
       *    SMART error logging
       *    SMART self-test
       *    General Purpose Logging feature set
       *    64-bit World wide name
       *    {READ,WRITE}_DMA_EXT_GPL commands
       *    Segmented DOWNLOAD_MICROCODE
       *    Gen1 signaling speed (1.5Gb/s)
       *    Gen2 signaling speed (3.0Gb/s)
       *    Gen3 signaling speed (6.0Gb/s)
       *    Native Command Queueing (NCQ)
       *    Host-initiated interface power management
       *    Phy event counters
       *    NCQ priority information
       *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
       *    DMA Setup Auto-Activate optimization
       *    Software settings preservation
       *    SMART Command Transport (SCT) feature set
       *    SCT Write Same (AC2)
       *    SCT Features Control (AC4)
       *    SCT Data Tables (AC5)
            unknown 206[12] (vendor specific)
            unknown 206[13] (vendor specific)
       *    DOWNLOAD MICROCODE DMA command
       *    WRITE BUFFER DMA command
       *    READ BUFFER DMA command
Security: 
    Master password revision code = 65534
        supported
    not enabled
    not locked
        frozen
    not expired: security count
        supported: enhanced erase
Logical Unit WWN Device Identifier: 50014ee20d232e87
    NAA     : 5
    IEEE OUI    : 0014ee
    Unique ID   : 20d232e87
Checksum: correct

Alternatively, use lshw:

sudo lshw -class disk -class storage


... etc

  *-scsi:0
       physical id: 1
       logical name: scsi0
       capabilities: emulated
     *-disk
          description: SCSI Disk
          physical id: 0.0.0
          bus info: scsi@0:0.0.0
          logical name: /dev/sda
          size: 931GiB (1TB)
          configuration: logicalsectorsize=512 sectorsize=512
  *-scsi:1
       physical id: 2
       logical name: scsi1
       capabilities: emulated
     *-disk
          description: ATA Disk
          product: WDC WD10EZEX-00W
          vendor: Western Digital
          physical id: 0.0.0
          bus info: scsi@1:0.0.0
          logical name: /dev/sdb
          version: 1A01
          serial: WD-WCC6Y2ZHZJAH
          size: 931GiB (1TB)
          capabilities: partitioned partitioned:dos
          configuration: ansiversion=5 logicalsectorsize=512 sectorsize=4096 signature=000e5fab


... etc

Another option, and possibly the best one, is smartctl:

sudo smartctl -d ata -a -i /dev/sdb


... etc


=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Blue
Device Model:     WDC WD10EZEX-00WN4A0
Serial Number:    WD-WCC6Y2ZHZJAH
LU WWN Device Id: 5 0014ee 20d232e87
Firmware Version: 01.01A01
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Jan 30 13:43:46 2017 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled


... etc
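
Since the serial number is usually printed on the drive label, it helps to note the serial of the healthy disk before opening the case, so the wrong drive does not get pulled. A small sketch, assuming a "Serial Number:" line like the ones in the smartctl and hdparm output above; `disk_serial` is a hypothetical helper name.

```shell
# Print the serial number found in smartctl -i or hdparm -I output (stdin).
disk_serial() {
    # Split on the colon plus following spaces; stop at the first match.
    awk -F': *' '/^[ \t]*Serial Number:/ { print $2; exit }'
}

# Usage: sudo smartctl -i /dev/sdb | disk_serial
```

The drive whose label does not carry that serial is the one to replace.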

== Remove the failed disk from the RAID1 array ==
In this example, I have 2 disks, /dev/sda and /dev/sdb, with 2 partitions on each disk (/dev/sda1, /dev/sda2, /dev/sdb1, /dev/sdb2), and /dev/sda has died. Mark its partitions as failed, then remove each one from the array it belongs to:

mdadm --manage /dev/md0 --fail /dev/sda1
mdadm --manage /dev/md1 --fail /dev/sda2

mdadm --manage /dev/md0 --remove /dev/sda1
mdadm --manage /dev/md1 --remove /dev/sda2

shutdown -h now
  • Physically remove the broken disk and install the new one
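
Pairing each partition with the right array by hand is easy to get wrong, so before shutting down, the fail/remove commands can be derived from /proc/mdstat. A rough sketch only: `removal_commands` is a hypothetical helper, it assumes sdX-style device names (not NVMe), and it prints the commands rather than running them so they can be reviewed first.

```shell
# Print (not run) the mdadm commands that fail and remove every partition
# of one disk from its array. Reads mdstat-formatted text on stdin.
# $1 is the disk name without /dev/, e.g. sda.
removal_commands() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                part = $i
                sub(/\[.*/, "", part)              # sda1[0](F) -> sda1
                if (part ~ ("^" disk "[0-9]+$")) { # partition of the given disk
                    printf "mdadm --manage /dev/%s --fail /dev/%s\n", $1, part
                    printf "mdadm --manage /dev/%s --remove /dev/%s\n", $1, part
                }
            }
        }
    '
}

# Usage: removal_commands sda < /proc/mdstat
```

For the degraded arrays in this example it would print the md0 commands for /dev/sda1 and the md1 commands for /dev/sda2.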

== Add the new disk to the RAID1 array ==
After booting with the replacement disk installed, only the /dev/sdb partitions remain in the arrays. Here is what mdstat looks like:

cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sdb2[1]
      976142144 blocks super 1.2 [2/1] [_U]

md0 : active raid1 sdb1[1]
      487104 blocks super 1.2 [2/1] [_U]

unused devices: <none>

== Mirror the partitioning scheme from the RAID disk to the new disk ==

sfdisk -d /dev/sdb | sfdisk /dev/sda
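
After copying the partition table, it may be worth confirming that both disks really have the same layout before adding anything to the arrays. A sketch under the assumption that `sfdisk -d` prints partition lines of the form "/dev/sdb1 : start=..., size=..., type=..."; `partition_layout` is a hypothetical helper that strips the device name so the two dumps can be compared.

```shell
# Reduce sfdisk -d output (stdin) to "partition-number start/size/type" rows,
# dropping device names and header lines such as label-id.
partition_layout() {
    sed -n 's|^/dev/[a-z]*\([0-9][0-9]*\) *: *|\1 |p'
}

# Usage (as root):
#   [ "$(sfdisk -d /dev/sdb | partition_layout)" = \
#     "$(sfdisk -d /dev/sda | partition_layout)" ] && echo "layouts match"
```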

== Add the new disk to the array ==

mdadm --manage /dev/md0 --add /dev/sda1 
mdadm --manage /dev/md1 --add /dev/sda2 

== Check the recovery status ==

cat /proc/mdstat 
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] 
md1 : active raid1 sda2[2] sdb2[1]
      976142144 blocks super 1.2 [2/1] [_U]
      [>....................]  recovery =  0.3% (3282752/976142144) finish=103.7min speed=156321K/sec

md0 : active raid1 sda1[2] sdb1[1]
      487104 blocks super 1.2 [2/2] [UU]

unused devices: <none>
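
To follow the rebuild without rereading the whole file, the percentage can be pulled out of the recovery line. A minimal sketch, assuming the "recovery = N%" (or "resync = N%") line format shown above; `rebuild_progress` is a hypothetical helper name.

```shell
# Print "array percent" for every array that is currently rebuilding.
# Reads mdstat-formatted text on stdin.
rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        /(recovery|resync) *=/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) {               # the field ending in a percent sign
                    pct = $i
                    sub(/^=/, "", pct)         # "=100.0%" has no space after "="
                    print name, pct
                }
        }
    '
}

# Usage: rebuild_progress < /proc/mdstat
```

Running `watch -n 5 cat /proc/mdstat` is a simpler alternative when the full output is wanted. GRUB can be installed while the rebuild is still running, but the array is only redundant again once the status shows [UU].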

== Install GRUB to the MBR of the new disk ==

grub-install /dev/sda